Over the last decade, a popular approach to non-parametric testing problems on general (i.e., non-Euclidean) domains has been based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed with this approach. First, we show that the popular MMD (maximum mean discrepancy) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification of the MMD test based on spectral regularization, which takes into account the covariance information (not captured by the MMD test), and prove that the proposed test is minimax optimal with a smaller separation boundary than that achieved by the MMD test. Third, we propose an adaptive version of the above test, which uses a data-driven strategy to choose the regularization parameter, and show that the adaptive test is minimax optimal up to a logarithmic factor. Moreover, our results hold for the permutation variant of the test, where the test threshold is chosen elegantly through permutation of the samples. Through numerical experiments on synthetic and real-world data, we demonstrate the superior performance of the proposed test in comparison to the MMD test.
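For orientation, the baseline that this abstract compares against can be sketched in a few lines. The following is a minimal sketch of a plain MMD two-sample test with a permutation threshold, not the spectrally regularized test proposed in the paper. The Gaussian kernel, the bandwidth, the biased (V-statistic) estimator, and all function names are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_from_gram(gram, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a pooled Gram matrix."""
    def block_mean(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))
    return (block_mean(idx_x, idx_x) + block_mean(idx_y, idx_y)
            - 2.0 * block_mean(idx_x, idx_y))


def mmd_permutation_test(xs, ys, num_permutations=200, bandwidth=1.0, seed=0):
    """Return (observed squared MMD, permutation p-value)."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    # The Gram matrix is computed once; permutations only reshuffle indices.
    gram = [[gaussian_kernel(p, q, bandwidth) for q in pooled] for p in pooled]
    n = len(xs)
    indices = list(range(len(pooled)))
    observed = mmd2_from_gram(gram, indices[:n], indices[n:])
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(indices)  # relabel the pooled sample at random
        if mmd2_from_gram(gram, indices[:n], indices[n:]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling among the permutations.
    return observed, (exceed + 1) / (num_permutations + 1)
```

Rejecting the null hypothesis when the returned p-value falls below the chosen level gives the permutation variant of the test; the threshold is never computed explicitly.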
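We consider shrinkage estimation of higher-order Hilbert space valued Bochner integrals in a non-parametric setting. We propose estimators that shrink the $U$-statistic estimator of the Bochner integral toward a pre-specified target element in the Hilbert space. Depending on the degeneracy of the kernel of the $U$-statistic, we construct consistent shrinkage estimators with fast rates of convergence and develop oracle inequalities comparing the risks of the $U$-statistic estimator and its shrinkage version. Surprisingly, we show that the shrinkage estimator designed by assuming complete degeneracy of the kernel of the $U$-statistic is a consistent estimator even when the kernel is not completely degenerate. This work subsumes and improves upon Krikamol et al., 2016, JMLR and Zhou et al., 2019, JMVA, which only handle mean element and covariance operator estimation in a reproducing kernel Hilbert space. We also specialize our results to normal mean estimation and show that for $d \ge 3$, the proposed estimator strictly improves upon the sample mean in terms of the mean squared error.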
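The normal mean case mentioned at the end of this abstract can be illustrated with the classical positive-part James-Stein estimator, which shrinks the sample mean toward a fixed target. This is a stand-in chosen for illustration, not the estimator proposed in the paper; the known unit variance, the function names, and the simulation settings are all assumptions.

```python
import random


def sample_mean(samples):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    return [sum(row[k] for row in samples) / n for k in range(d)]


def james_stein(samples, target, sigma2=1.0):
    """Positive-part James-Stein estimate, shrinking the sample mean toward
    `target`. Assumes i.i.d. N(mean, sigma2 * I) data with known sigma2."""
    n, d = len(samples), len(target)
    xbar = sample_mean(samples)
    diff = [xbar[k] - target[k] for k in range(d)]
    sq_norm = sum(v * v for v in diff)
    if sq_norm == 0.0:
        return list(target)
    # The sample mean has covariance (sigma2 / n) * I, hence the factor 1/n.
    shrink = max(0.0, 1.0 - (d - 2) * sigma2 / (n * sq_norm))
    return [target[k] + shrink * diff[k] for k in range(d)]


def compare_risks(true_mean, target, n=10, reps=2000, seed=0):
    """Monte Carlo mean squared errors of (sample mean, James-Stein)."""
    rng = random.Random(seed)
    loss_mean = loss_js = 0.0
    for _ in range(reps):
        samples = [[rng.gauss(mu, 1.0) for mu in true_mean] for _ in range(n)]
        xbar = sample_mean(samples)
        js = james_stein(samples, target)
        loss_mean += sum((a - b) ** 2 for a, b in zip(xbar, true_mean))
        loss_js += sum((a - b) ** 2 for a, b in zip(js, true_mean))
    return loss_mean / reps, loss_js / reps
```

Calling `compare_risks([0.2] * 5, [0.0] * 5)` simulates a five-dimensional problem in which the target lies close to the true mean, the regime where shrinkage helps most.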
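Discrepancy measures between probability distributions are at the core of statistical inference and machine learning. In many applications, the distributions of interest are supported on different spaces, and yet a meaningful correspondence between data points is desired. Motivated to explicitly encode consistent bidirectional maps into the discrepancy measure, this work proposes a novel unbalanced Monge optimal transport formulation for matching, up to isometries, distributions on different spaces. Our formulation arises as a principled relaxation of the Gromov-Hausdorff distance between metric spaces, and employs two cycle-consistent maps that push forward each distribution onto the other. We study structural properties of the proposed discrepancy and, in particular, show that it captures the popular cycle-consistent generative adversarial network (GAN) framework as a special case, thereby providing a theoretical explanation for it. Motivated by computational efficiency, we then kernelize the discrepancy and restrict the mappings to parametric function classes. The resulting kernelized version is coined the generalized maximum mean discrepancy (GMMD). Convergence rates for the empirical estimation of GMMD are studied, and experiments supporting our theory are provided.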
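We derive and analyze a generic, recursive algorithm for estimating all splits in a finite cluster tree as well as the corresponding clusters. We further investigate the statistical properties of this generic clustering algorithm when it receives level set estimates from a kernel density estimator. In particular, we derive finite sample guarantees, consistency, rates of convergence, and an adaptive data-driven strategy for choosing the kernel bandwidth. For these results we do not need continuity assumptions on the density such as Hölder continuity, but only require intuitive geometric assumptions of a non-parametric nature.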
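Kernel methods are powerful learning methodologies that enable non-linear data analysis. Despite their popularity, they scale poorly in big data scenarios. Various approximation methods, including random feature approximation, have been proposed to alleviate the problem. However, the statistical consistency of most of these approximate kernel methods is not well understood, except for kernel ridge regression, where it has been shown that the random feature approximation is not only computationally efficient but also statistically consistent with a minimax optimal rate of convergence. In this paper, we investigate the efficacy of random feature approximation in the context of kernel principal component analysis (KPCA) by studying the trade-off between the computational and statistical behaviors of approximate KPCA. We show that approximate KPCA is both computationally and statistically efficient compared to KPCA in terms of the error associated with reconstructing a kernel function based on its projection onto the corresponding eigenspaces. The analysis hinges on Bernstein-type inequalities for the operator and Hilbert-Schmidt norms of a self-adjoint Hilbert-Schmidt operator-valued U-statistic, which are of independent interest.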
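To make the idea of random feature approximation concrete, here is a minimal sketch that builds random Fourier features for a Gaussian kernel and compares the top eigenvalue of the resulting feature covariance with that of the exact scaled Gram matrix. The uncentered formulation, the use of power iteration, the choice of kernel, and every name and parameter below are assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_rff_map(dim, num_features, bandwidth=1.0, seed=0):
    """Random Fourier feature map z with E[z(x) . z(y)] = gaussian_kernel(x, y)."""
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(dot(freq, x) + phase)
                for freq, phase in zip(freqs, phases)]
    return feature_map


def top_eigenvalue(matvec, dim, num_iters=50, seed=0):
    """Power iteration for the largest eigenvalue of a PSD linear map."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    v = [a / norm for a in v]
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = matvec(v)
        eigenvalue = math.sqrt(dot(w, w))
        v = [a / eigenvalue for a in w]
    return eigenvalue


def exact_top_eigenvalue(points, bandwidth=1.0):
    """Top eigenvalue of K / n, which costs O(n^2) memory."""
    n = len(points)
    gram = [[gaussian_kernel(p, q, bandwidth) / n for q in points] for p in points]
    return top_eigenvalue(lambda v: [dot(row, v) for row in gram], n)


def approx_top_eigenvalue(points, feature_map):
    """Top eigenvalue of the m x m feature covariance (1/n) * sum_i z_i z_i^T."""
    feats = [feature_map(x) for x in points]
    n, m = len(feats), len(feats[0])

    def cov_matvec(v):
        out = [0.0] * m
        for z in feats:
            coeff = dot(z, v) / n
            for k in range(m):
                out[k] += coeff * z[k]
        return out
    return top_eigenvalue(cov_matvec, m)
```

The computational appeal is that the approximate problem lives in the number of random features rather than the number of samples, so it pays off when the sample size is much larger than the feature count; the small sizes used to exercise this sketch are only for speed.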
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g., statistical power) but substantially better in earning (e.g., direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
Modelling and forecasting real-life human behaviour using online social media is an active endeavour of interest in politics, government, academia, and industry. Since its creation in 2006, Twitter has been proposed as a potential laboratory that could be used to gauge and predict social behaviour. During the last decade, the user base of Twitter has been growing and becoming more representative of the general population. Here we analyse this user base in the context of the 2021 Mexican Legislative Election. To do so, we use a dataset of 15 million election-related tweets in the six months preceding election day. We explore different election models that assign political preference to either the ruling parties or the opposition. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods. These results demonstrate that analysis of public online data can outperform conventional polling methods, and that political analysis and general forecasting would likely benefit from incorporating such data in the immediate future. Moreover, the same Twitter dataset with geographical attributes is positively correlated with results from official census data on population and internet usage in Mexico. These findings suggest that we have reached a period in time when online activity, appropriately curated, can provide an accurate representation of offline behaviour.
Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
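The "classification as matching" formulation in this abstract can be sketched as a nearest-representation lookup: a data representation is assigned to the class whose representation it matches best. The cosine similarity score, the function names, and the hard-coded three-dimensional vectors below are hypothetical placeholders; in the described method both kinds of representations would come from learned data and label encoders.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    inner = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return inner / (norm_u * norm_v)


def match_class(data_repr, class_reprs):
    """Return the class name whose representation best matches data_repr."""
    return max(class_reprs,
               key=lambda name: cosine_similarity(data_repr, class_reprs[name]))


# Hypothetical class representations, standing in for label-encoder outputs
# computed from the shared natural-language class names.
CLASS_REPRS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}
```

Because the class representations are shared by all clients, the same lookup can in principle score a local sample against classes the client has never labelled, which is the role similarity plays in annotating locally-unaware classes.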
This paper studies smooth function approximation by neural networks (NNs). Mathematical or physical functions can be replaced by NN models through regression. In this study, we obtain NNs that generate highly accurate and highly smooth functions while comprising only a few weight parameters, by discussing several topics in regression. First, we reinterpret the inner workings of NNs for regression and, as a consequence, propose a new activation function, the integrated sigmoid linear unit (ISLU). Next, we discuss the special characteristics of metadata for regression, which differ from those of other data such as images or sound, with the aim of improving the performance of neural networks. Finally, we present a simple hierarchical NN that generates models substituting for mathematical functions, and introduce a new batch concept, the "meta-batch", which improves NN performance several times over. The new activation function, the meta-batch method, the features of numerical data, meta-augmentation with metaparameters, and an NN structure that generates a compact multi-layer perceptron (MLP) are the essential elements of this study.
The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.